
    NPLDA: A Deep Neural PLDA Model for Speaker Verification

    The state-of-the-art approach for speaker verification consists of a neural network based embedding extractor along with a backend generative model such as Probabilistic Linear Discriminant Analysis (PLDA). In this work, we propose a neural network approach for backend modeling in speaker recognition. The likelihood ratio score of the generative PLDA model is posed as a discriminative similarity function, and the learnable parameters of the score function are optimized using a verification cost. The proposed model, termed the neural PLDA (NPLDA), is initialized using the generative PLDA model parameters. The loss function for the NPLDA model is an approximation of the minimum detection cost function (DCF). Speaker recognition experiments using the NPLDA model are performed on the speaker verification task in the VOiCES datasets as well as the SITW challenge dataset. In these experiments, the NPLDA model optimized using the proposed loss function improves significantly over the state-of-the-art PLDA based speaker verification system.
    Comment: Published in Odyssey 2020, the Speaker and Language Recognition Workshop (VOiCES Special Session). Link to GitHub Implementation: https://github.com/iiscleap/NeuralPlda. arXiv admin note: substantial text overlap with arXiv:2001.0703
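
    The scoring scheme described above can be illustrated with a short, hypothetical PyTorch sketch (not the code in the linked NeuralPlda repository): the PLDA log-likelihood ratio is a quadratic function of the enrollment and test embeddings, so the backend becomes a learnable quadratic scorer trained with a sigmoid-smoothed detection cost. The parameter names, the smoothing constant alpha, and the omission of the paper's preprocessing (LDA, length normalization) are simplifications assumed for this example.

        import torch
        import torch.nn as nn

        class NPLDAScorer(nn.Module):
            """Learnable quadratic similarity, mirroring the form of the PLDA
            log-likelihood ratio: s(xe, xt) = xe^T P xt + xe^T Q xe + xt^T Q xt + c.
            P, Q and c can be initialized from a trained generative PLDA model."""
            def __init__(self, dim):
                super().__init__()
                self.P = nn.Parameter(torch.eye(dim))         # cross term between enrollment and test
                self.Q = nn.Parameter(torch.zeros(dim, dim))  # shared quadratic term
                self.c = nn.Parameter(torch.zeros(1))         # score offset

            def forward(self, xe, xt):
                cross = (xe @ self.P * xt).sum(dim=-1)
                quad = (xe @ self.Q * xe).sum(dim=-1) + (xt @ self.Q * xt).sum(dim=-1)
                return cross + quad + self.c

        def soft_detection_cost(scores, labels, theta=0.0, alpha=10.0,
                                p_target=0.01, c_miss=1.0, c_fa=1.0):
            """Differentiable stand-in for the minimum detection cost: sigmoids
            replace the hard miss / false-alarm counts at threshold theta."""
            tgt = labels.float()
            p_miss = (torch.sigmoid(alpha * (theta - scores)) * tgt).sum() / tgt.sum().clamp(min=1)
            p_fa = (torch.sigmoid(alpha * (scores - theta)) * (1 - tgt)).sum() / (1 - tgt).sum().clamp(min=1)
            return c_miss * p_target * p_miss + c_fa * (1 - p_target) * p_fa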

    Neural PLDA Modeling for End-to-End Speaker Verification

    While deep learning models have made significant advances in supervised classification problems, the application of these models to out-of-set verification tasks like speaker recognition has been limited to deriving feature embeddings. The state-of-the-art x-vector PLDA based speaker verification systems use a generative model based on probabilistic linear discriminant analysis (PLDA) for computing the verification score. Recently, we proposed a neural network approach for backend modeling in speaker verification called the neural PLDA (NPLDA), where the likelihood ratio score of the generative PLDA model is posed as a discriminative similarity function and the learnable parameters of the score function are optimized using a verification cost. In this paper, we extend this work to achieve joint optimization of the embedding neural network (x-vector network) with the NPLDA network in an end-to-end (E2E) fashion. The proposed end-to-end model is optimized directly from the acoustic features with a verification cost function, and during testing the model directly outputs the likelihood ratio score. With various experiments using the NIST speaker recognition evaluation (SRE) 2018 and 2019 datasets, we show that the proposed E2E model improves significantly over the x-vector PLDA baseline speaker verification system.
    Comment: Accepted in Interspeech 2020. GitHub Implementation Repos: https://github.com/iiscleap/E2E-NPLDA and https://github.com/iiscleap/NeuralPlda
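
    As a rough illustration of the end-to-end arrangement (not the released E2E-NPLDA code), a placeholder embedding network can be chained with the NPLDA-style scorer from the previous sketch so that acoustic features go in and a single verification score comes out, trainable with the same smoothed detection cost. The EmbeddingNet below is a deliberately simplified stand-in for the TDNN-based x-vector extractor, and the feature and embedding dimensions are assumed values.

        import torch
        import torch.nn as nn

        class EmbeddingNet(nn.Module):
            """Simplified stand-in for the x-vector extractor: maps a
            (batch, time, feat) tensor of acoustic features to a fixed-size
            embedding via a frame-level transform and mean pooling."""
            def __init__(self, feat_dim=40, emb_dim=512):
                super().__init__()
                self.frame = nn.Sequential(nn.Linear(feat_dim, emb_dim), nn.ReLU())
                self.segment = nn.Linear(emb_dim, emb_dim)

            def forward(self, feats):
                return self.segment(self.frame(feats).mean(dim=1))  # pool over time

        class E2EVerifier(nn.Module):
            """Embedding extractor and NPLDA-style backend in one module, so the
            whole chain is optimized from features to verification score."""
            def __init__(self, feat_dim=40, emb_dim=512):
                super().__init__()
                self.embed = EmbeddingNet(feat_dim, emb_dim)
                self.scorer = NPLDAScorer(emb_dim)   # from the previous sketch

            def forward(self, enroll_feats, test_feats):
                return self.scorer(self.embed(enroll_feats), self.embed(test_feats))

        # One training step (illustrative):
        #   scores = model(enroll, test)
        #   loss = soft_detection_cost(scores, trial_labels)
        #   loss.backward(); optimizer.step()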

    Towards Few-shot Entity Recognition in Document Images: A Graph Neural Network Approach Robust to Image Manipulation

    Recent advances in incorporating layout information, typically bounding box coordinates, into pre-trained language models have achieved significant performance gains in entity recognition from document images. Coordinates can easily model the absolute position of each token, but they might be sensitive to manipulations in document images (e.g., shifting, rotation or scaling), especially when the training data is limited in few-shot settings. In this paper, we propose to further introduce the topological adjacency relationship among the tokens, emphasizing their relative position information. Specifically, we consider the tokens in the documents as nodes and formulate the edges based on topological heuristics from the k-nearest bounding boxes. Such adjacency graphs are invariant to affine transformations including shifting, rotation and scaling. We incorporate these graphs into the pre-trained language model by adding graph neural network layers on top of the language model embeddings, leading to a novel model, LAGER. Extensive experiments on two benchmark datasets show that LAGER significantly outperforms strong baselines under different few-shot settings and also demonstrates better robustness to such manipulations.
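
    The graph construction described in the abstract can be sketched roughly as follows (a hypothetical illustration, not the authors' LAGER code): bounding-box centers give a k-nearest-neighbour adjacency matrix, and a simple mean-aggregation message-passing layer mixes each token's language-model embedding with those of its spatial neighbours. The value of k, the Euclidean distance measure, and the single-layer design are assumptions made for this example.

        import torch
        import torch.nn as nn

        def knn_adjacency(box_centers, k=4):
            """Build a k-nearest-neighbour adjacency matrix from token
            bounding-box centers of shape (n, 2). Relative proximity, unlike
            absolute coordinates, is unaffected by shifting, rotation and
            uniform scaling of the page."""
            d = torch.cdist(box_centers, box_centers)   # pairwise distances
            d.fill_diagonal_(float('inf'))              # exclude self-edges
            idx = d.topk(k, largest=False).indices      # k closest boxes per token
            adj = torch.zeros(len(box_centers), len(box_centers))
            adj.scatter_(1, idx, 1.0)
            return ((adj + adj.t()) > 0).float()        # symmetrize

        class GraphLayer(nn.Module):
            """One mean-aggregation message-passing layer applied on top of the
            language-model token embeddings."""
            def __init__(self, dim):
                super().__init__()
                self.lin = nn.Linear(2 * dim, dim)

            def forward(self, h, adj):
                deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
                neigh = adj @ h / deg                   # mean over neighbours
                return torch.relu(self.lin(torch.cat([h, neigh], dim=-1)))

        # usage (hypothetical dims): adj = knn_adjacency(centers, k=4)
        #                            h1 = GraphLayer(768)(token_embeddings, adj)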

    Activate or inhibit? Implications of autophagy modulation as a therapeutic strategy for Alzheimer’s disease

    Neurodegenerative diseases result in a range of conditions depending on the type of proteinopathy, genes affected or the location of the degeneration in the brain. Proteinopathies such as senile plaques and neurofibrillary tangles in the brain are prominent features of Alzheimer’s disease (AD). Autophagy is a highly regulated mechanism of eliminating dysfunctional organelles and proteins, and plays an important role in removing these pathogenic intracellular protein aggregates, not only in AD, but also in other neurodegenerative diseases. Activating autophagy is gaining interest as a potential therapeutic strategy for chronic diseases featuring protein aggregation and misfolding, including AD. Although autophagy activation is a promising intervention, over-activation of autophagy in neurodegenerative diseases that display impaired lysosomal clearance may accelerate pathology, suggesting that the success of any autophagy-based intervention is dependent on lysosomal clearance being functional. Additionally, the effects of autophagy activation may vary significantly depending on the physiological state of the cell, especially during proteotoxic stress and ageing. Growing evidence seems to favour a strategy of enhancing the efficacy of autophagy by preventing or reversing the impairments of the specific processes that are disrupted. Therefore, it is essential to understand the underlying causes of the autophagy defect in different neurodegenerative diseases to explore possible therapeutic approaches. This review will focus on the role of autophagy during stress and ageing, consequences that are linked to its activation and caveats in modulating this pathway as a treatment

    Hyperon bulk viscosity and r-modes of neutron stars

    We propose and apply a new parameterization of the modified chiral effective model to study rotating neutron stars with hyperon cores in the framework of the relativistic mean-field theory. The inclusion of mesonic cross couplings in the model has improved the density content of the symmetry energy slope parameters, which are in agreement with the findings from recent terrestrial experiments. The bulk viscosity of the hyperonic medium is analyzed to investigate its role in the suppression of gravitationally driven r-modes. The hyperonic bulk viscosity coefficient caused by non-leptonic weak interactions and the corresponding damping timescales are calculated and the r-mode instability windows are obtained. The present model predicts a significant reduction of the unstable region due to a more effective damping of oscillations. We find that from $\sim 10^{8}$ K to $\sim 10^{9}$ K, hyperonic bulk viscosity completely suppresses the r-modes, leading to a stable region between the instability windows. Our analysis indicates that the instability can reduce the angular velocity of the star up to $\sim 0.3\,\Omega_K$, where $\Omega_K$ is the Kepler frequency of the star.
    Comment: 9 pages, 9 figures; Accepted for publication in MNRA
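
    For context, the instability window mentioned in the abstract is bounded by the competition between gravitational-radiation driving and viscous damping; a schematic form of the criterion, standard in the r-mode literature rather than the paper's specific fit formulae, is

        \frac{1}{\tau(\Omega, T)}
          = -\frac{1}{\tau_{\mathrm{GR}}(\Omega)}
            + \frac{1}{\tau_{\mathrm{BV}}(\Omega, T)}
            + \frac{1}{\tau_{\mathrm{SV}}(\Omega, T)} ,

    where $\tau_{\mathrm{GR}}$ is the gravitational-wave growth timescale, $\tau_{\mathrm{BV}}$ the bulk-viscosity damping timescale (here dominated by the hyperonic non-leptonic contribution), and $\tau_{\mathrm{SV}}$ the shear-viscosity damping timescale. The mode is unstable where $1/\tau < 0$, and the curve $1/\tau(\Omega, T) = 0$ in the temperature versus angular-velocity plane traces the boundary of the instability window.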